Leveraging mathematical models of disease dynamics and machine learning to improve development of novel malaria interventions

Background Substantial research is underway to develop next-generation interventions that address current malaria control challenges. As there is limited testing in their early development, it is difficult to predefine intervention properties such as efficacy that achieve target health goals, and therefore challenging to prioritize selection of novel candidate interventions. Here, we present a quantitative approach to guide intervention development using mathematical models of malaria dynamics coupled with machine learning. Our analysis identifies requirements of efficacy, coverage, and duration of effect for five novel malaria interventions to achieve targeted reductions in malaria prevalence.
Methods A mathematical model of malaria transmission dynamics is used to simulate deployment and predict potential impact of new malaria interventions by considering operational, health-system, population, and disease characteristics. Our method relies on consultation with product development stakeholders to define the putative space of novel intervention specifications. We couple the disease model with machine learning to search this multi-dimensional space and efficiently identify optimal intervention properties that achieve specified health goals.
Results We apply our approach to five malaria interventions under development. Aiming for malaria prevalence reduction, we identify and quantify key determinants of intervention impact along with their minimal properties required to achieve the desired health goals. While coverage is generally identified as the largest driver of impact, higher efficacy, longer protection duration or multiple deployments per year are needed to increase prevalence reduction. We show that interventions on multiple parasite or vector targets, as well as combinations of the new interventions with drug treatment, lead to significant burden reductions and lower efficacy or duration requirements.
Conclusions Our approach uses disease dynamic models and machine learning to support decision-making and resource investment, facilitating development of new malaria interventions. By evaluating the intervention capabilities in relation to the targeted health goal, our analysis allows prioritization of interventions and of their specifications from an early stage in development, and subsequent investments to be channeled cost-effectively towards impact maximization. This study highlights the role of mathematical models to support intervention development. Although we focus on five malaria interventions, the analysis is generalizable to other new malaria interventions.
Supplementary Information The online version contains supplementary material available at 10.1186/s40249-022-00981-1.


S1.1. In the sections below, an overview of the calibration and simulation settings used for this study has been provided.

- Determined by EIR which is a model input and affects the force of infection in the simulated setting
- Considers an age-dependent exposure of human hosts to mosquitoes (correlating with body-surface area)
- The relationship between infection rates and EIR is defined and fitted with data from The Gambia, Nigeria and Kenya in (26)

Infection progression in humans: asexual parasite densities and immunity (10,15,26,27) and eq. 5-15 of Additional file 1 in (15)
- Blood-stage parasite density depends on the time since infection and is affected by naturally acquired immunity. Acquired immunity reduces parasite density of subsequent infections.
- The duration of infection follows a log-normal distribution and is estimated from a malaria therapy dataset ((27) and eq. 1 in (10))
- Immunity (both pre-erythrocytic and blood-stage) develops progressively following consequent episodes of exposure to infection and total parasitemia seen by an individual in their lifetime.
- Super-infection is possible with cumulative parasite densities
- The parasite density in a host at a given time is defined and fitted with data from Ghana, Nigeria and Tanzania in (10)

Transmission from infected humans to mosquitoes (11,15,28) and eq. 16-21 of Additional file 1 in (15)
- Infectivity to mosquitoes depends on the density of parasites present in the human (including a time-lag for gametocyte development)
- The fraction of resulting infected mosquitoes after feeding on a human host follows a binomial distribution
- The relationship between infectivity to mosquitoes and parasite density was defined and fitted in (11) with data from malaria therapy collected in Georgia between 1940 and 1963 and available from (27)
- The age-specific contribution to overall infectiousness to mosquitoes was validated in (11) against field data collected from Liberia, The Gambia, Tanzania, Kenya, Papua New Guinea and Cameroon.

Clinical illness, morbidity, mortality, and anemia (12,13,15,29) and eq. 22-32 of Additional file 1 in
- Acute clinical illness depends on human host parasite densities and their pyrogenic threshold which evolves over time depending on the individual exposure history
- Acute morbidity episodes can be uncomplicated or evolve to severe episodes; a proportion of the severe episodes leads to deaths
- The probability of a clinical malaria episode was defined and fitted with data from Senegal in (13)
- The probabilities that a clinical episode becomes severe and the risk of mortality for a severe episode are defined and fitted to field data from over 10 African countries in (12)

Modelled characteristics of the transmission setting

Table 1 in (15), in Table 3 in (25), and in Additional file 1: Table S1 in (16). A summary of the model parameters has been provided in Additional file 1: Tables S1.2 and S1.3.

As summarized in the Methods section, the simulated human population size in this analysis was demographic surveillance site in Ifakara, Tanzania, available through the INDEPTH network (30).

For all simulations, we assumed that there were no imported infections during the entire study period.

Health system characteristics (Additional file 1:   vector characteristics were also included in the simulation specifications (Fig. 1, Table 1).

The following intervention targets were defined in the transmission cycle (Fig. 1): "anti-infective" as acting at the liver stage and preventing occurrence of a new infection, "blood stage clearance" as clearing blood-stage parasites by administration of a drug, "transmission blocking" as mosquitoes during different stages of their life cycle, for example, before a blood meal (pre-prandial killing) and/or after a blood meal (post-prandial killing). Furthermore, mosquitoes are affected by vector control interventions according to their indoor and outdoor biting patterns.

The length of the intervention effect was described by half-life for exponential, sigmoidal, or biphasic decay profiles, or by duration for step-like decay profiles. Generally, half-life refers to the half-life of intervention efficacy decay, representing the time in which the initial intervention efficacy has been reduced by 50% (Additional file 1: Fig. S2

in the present study. In seasonal, low-transmission settings (EIR < 2) a high proportion of simulations reached elimination before any intervention was deployed and were removed from the analysis (Additional file 1: Fig. S3.5). Since this happened for over 75% of simulations at EIR < 2, we did not investigate optimal intervention profiles for transmission settings with EIR < 2.
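The half-life and duration conventions described above for the decay profiles can be illustrated with a minimal sketch. The functional forms below are assumptions chosen for illustration (a plain exponential, a Weibull-type curve standing in for the sigmoidal profile, and a hard cut-off for the step profile); they are not necessarily the exact parameterizations used in the simulations, and the biphasic profile is omitted. The function names and the `shape` parameter are hypothetical.

```python
import math


def exponential_decay(t, initial_efficacy, half_life):
    """Efficacy at time t when efficacy halves every `half_life` time units."""
    return initial_efficacy * 0.5 ** (t / half_life)


def sigmoidal_decay(t, initial_efficacy, half_life, shape=3.0):
    """Weibull-type curve (assumed form): flat at first, then a steep drop.

    The ln(2) factor guarantees that efficacy equals half of its initial
    value at t == half_life, whatever the (hypothetical) shape parameter.
    """
    return initial_efficacy * math.exp(-math.log(2.0) * (t / half_life) ** shape)


def step_decay(t, initial_efficacy, duration):
    """Full efficacy until `duration` has elapsed, then no effect."""
    return initial_efficacy if t < duration else 0.0


# Compare the three profiles for an initial efficacy of 80% and a
# half-life (or duration) of 30 days.
for day in (0, 15, 30, 60):
    print(
        day,
        round(exponential_decay(day, 0.8, 30.0), 3),
        round(sigmoidal_decay(day, 0.8, 30.0), 3),
        step_decay(day, 0.8, 30.0),
    )
```

Under these assumed forms, both the exponential and the sigmoidal profile pass through 50% of the initial efficacy at the half-life, but the sigmoidal one retains more protection before that point and loses it faster afterwards.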

Arguably, for settings close to elimination, a different health goal, such as the probability of elimination, would be more appropriate; this is outside the scope of this study, which focuses on reducing PfPR0-99.

For each intervention, we successively identified the minimum profiles of the intervention

Gaussian process models are non-parametric models which define a prior probability distribution over a collection of functions using a kernel (smoothing) function. Precisely, given the relationship

$$y = f(x),$$

where y is the PfPR0-99 reduction here, and x represents the set of intervention parameters x1, …, xn, the main assumption of a GP is that

$$\big(f(x_1), \dots, f(x_n)\big) \sim \mathcal{N}(\mu, \Sigma), \qquad \Sigma_{ij} = K(x_i, x_j),$$

where $\Sigma$ is the covariance matrix of the Gaussian distribution, $\mu$ is its mean, and K is a kernel function (40).

Once data are observed, the posterior probability distribution of the functions consistent with the observed data can be derived, which is then used to infer outcomes at unobserved locations in the parameter space (40). The intuition behind a GP model is based on the "smoothness" relationship between its components. Accordingly, points which are close in the input parameter space will lead to close points in the output space.

and Table S4.1). Precisely, the training set was split into 5 subsets and, iteratively, 4 of these subsets were used for training the GP, while the remaining set was used as an out-of-sample test set during the cross-validation procedure. After assessing the prediction error obtained during the cross-validation procedure, the GP was trained using the entire training set.
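The emulator workflow described above (GP posterior prediction, 5-fold cross-validation, then refitting on the full training set) can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the study: the squared-exponential kernel, its fixed hyperparameters, the noise term, and the two-parameter `toy_simulator` are all hypothetical stand-ins, and the linear algebra is written in plain Python only to keep the example self-contained.

```python
import math
import random


def rbf_kernel(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential kernel (assumed): nearby inputs are strongly correlated."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return variance * math.exp(-0.5 * sq_dist / length_scale ** 2)


def cholesky(m):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def chol_solve(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_fit(xs, ys, noise=1e-4):
    """Precompute the weights needed for the GP posterior mean."""
    y_mean = sum(ys) / len(ys)
    gram = [[rbf_kernel(a, b) + (noise if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = chol_solve(cholesky(gram), [y - y_mean for y in ys])
    return xs, alpha, y_mean


def gp_predict(model, x_new):
    """Posterior mean at an unobserved location in the parameter space."""
    xs, alpha, y_mean = model
    return y_mean + sum(rbf_kernel(x_new, xi) * ai for xi, ai in zip(xs, alpha))


def k_fold_rmse(xs, ys, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, pool the squared errors."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_errors = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = gp_fit([xs[i] for i in train], [ys[i] for i in train])
        sq_errors += [(gp_predict(model, xs[i]) - ys[i]) ** 2 for i in test]
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def toy_simulator(efficacy, coverage):
    """Hypothetical stand-in for the disease model: prevalence reduction in %."""
    return 100.0 * efficacy * coverage


rng = random.Random(1)
inputs = [(rng.random(), rng.random()) for _ in range(60)]
outputs = [toy_simulator(e, c) for e, c in inputs]

cv_rmse = k_fold_rmse(inputs, outputs)   # out-of-sample error estimate
final_model = gp_fit(inputs, outputs)    # refit on the entire training set
print("5-fold CV RMSE:", cv_rmse)
```

In practice the kernel hyperparameters would be estimated from the training data rather than fixed, and a dedicated GP library would replace the hand-written solver.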

Furthermore, since the trained GP model provides the mean and variance for each predicted output,

where Y is the model outcome (in this case, PfPR0-99 reduction), d is the number of model inputs, and the conditional variances defined as:

with x1, …, xn representing the model input parameters.

Calculating the sensitivity indices defined above, the variance of the GP emulator output was thus decomposed into proportions attributable to intervention characteristics, i.e., intervention efficacy, half-life, and deployment coverage, as well as access to care. Using the main effects, the relative importance ri of each characteristic as a proxy for impact determinants was defined as follows:

Table S2.1.
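The variance-based sensitivity analysis described above can be illustrated with a short Monte Carlo sketch. It assumes the standard first-order (main effect) index, S_i = Var(E[Y | x_i]) / Var(Y), estimated with a pick-freeze scheme, and it assumes that relative importance is obtained by normalizing the main effects so that they sum to one; that normalization is an assumption made here because the defining equation is not reproduced in this text. The linear `toy_emulator` and its coefficients are hypothetical and merely stand in for the trained GP.

```python
import random


def toy_emulator(x):
    """Hypothetical emulator: x = (coverage, efficacy, half-life), each scaled to [0, 1]."""
    return 4.0 * x[0] + 2.0 * x[1] + 1.0 * x[2]


def first_order_sobol(model, n_inputs, n_samples=20000, seed=0):
    """Pick-freeze Monte Carlo estimate of the main-effect indices S_i."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]
    pooled = y_a + y_b
    mean = sum(pooled) / len(pooled)
    total_var = sum((y - mean) ** 2 for y in pooled) / (len(pooled) - 1)
    indices = []
    for i in range(n_inputs):
        # Rows of A in which only input i is taken from the matching row of B.
        y_ab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        v_i = sum((yb - mean) * (yab - ya)
                  for yb, yab, ya in zip(y_b, y_ab, y_a)) / n_samples
        indices.append(v_i / total_var)
    return indices


def relative_importance(main_effects):
    """Normalize main effects to sum to one (assumed definition of r_i)."""
    clipped = [max(0.0, s) for s in main_effects]
    total = sum(clipped)
    return [s / total for s in clipped]


main_effects = first_order_sobol(toy_emulator, n_inputs=3)
importance = relative_importance(main_effects)
for name, r in zip(("coverage", "efficacy", "half-life"), importance):
    print(name, round(r, 3))
```

For this linear toy function with independent uniform inputs the indices are known analytically (16/21, 4/21 and 1/21), which makes it a convenient check of the estimator before applying it to an emulator whose indices are unknown.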

To solve the above optimization problem, a general nonlinear augmented Lagrange multiplier

Under the simulated levels of case management, before intervention deployment, in seasonal settings, at low-transmission (simulated EIR < 2, corresponding simulated true PfPR2-10 <

Fig. S3.5). For this reason, the space of obtained prevalence reductions following intervention deployment was rather sparse and the obtained optima were not reliable and often did not converge. Therefore, we chose to report minimum intervention profiles for settings with true PfPR2-10 >= 11.7% (with RDTs this yields a patent PfPR2-10 >= 5.8%).

perennial) and three types of mosquito biting patterns (low, medium, and high indoor biting).
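The augmented Lagrange multiplier approach mentioned earlier in this section for finding minimum intervention profiles can be illustrated on a one-dimensional toy problem: find the smallest coverage whose predicted prevalence reduction meets a target. Everything below is a sketch under assumptions. The saturating `emulator` is a hypothetical stand-in for the trained GP, the inner minimization uses a simple golden-section search, and the penalty parameter `rho` and iteration count are arbitrary choices; the solver used in the study may differ in all of these respects.

```python
import math


def emulator(coverage):
    """Hypothetical stand-in for the GP: predicted prevalence reduction (%)."""
    return 100.0 * (1.0 - math.exp(-3.0 * coverage))


def golden_section_min(func, lo, hi, tol=1e-9):
    """Minimize a unimodal function on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if func(c) < func(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)


def minimal_coverage(target, rho=1.0, iterations=20):
    """Minimize coverage subject to emulator(coverage) >= target.

    The constraint is written as g(c) = target - emulator(c) <= 0 and handled
    with the augmented Lagrangian for inequality constraints:
        L(c) = c + (max(0, lam + rho * g(c))**2 - lam**2) / (2 * rho)
    followed by the multiplier update lam <- max(0, lam + rho * g(c)).
    """
    lam = 0.0
    coverage = 1.0
    for _ in range(iterations):
        def augmented(c, lam=lam):
            g = target - emulator(c)
            return c + (max(0.0, lam + rho * g) ** 2 - lam ** 2) / (2.0 * rho)

        coverage = golden_section_min(augmented, 0.0, 1.0)
        lam = max(0.0, lam + rho * (target - emulator(coverage)))
    return coverage


required = minimal_coverage(target=60.0)
print("Minimum coverage for a 60% reduction:", round(required, 4))
```

For this toy emulator the answer can be checked by hand, since 100 * (1 - exp(-3c)) = 60 gives c = -ln(0.4) / 3. With a real emulator the objective would typically be multi-dimensional, and sparse or flat regions of the predicted surface can prevent convergence, as noted above for low-transmission settings.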

The mosquito biting patterns had little to no effect on the results of the sensitivity analysis for